Skip to content

Conversation

@HuijingHei
Copy link
Member

@HuijingHei HuijingHei commented Nov 5, 2025

There is a race condition for ostree.sync and kdump.crash.nfs to bind port 2049, add retry if there is.
Fixes: coreos/rhel-coreos-config#89

@HuijingHei HuijingHei changed the title qemu.go: add sleep if there is a race condition to bind port 2049 qemu.go: retry if there is a race condition to bind port 2049 Nov 5, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a retry mechanism with a delay to handle a race condition when binding to a port. The change is logical and addresses the issue. I have one suggestion to improve the code's readability and maintainability by using constants for retry parameters and adopting the project's standard logger.

Comment on lines 650 to 661
maxRetries := 12
for attempt := 1; attempt <= maxRetries; attempt++ {
l, err = net.Listen("tcp", address)
if err == nil {
l.Close()
break
}

fmt.Printf("Failed to listen on %s: %v, retrying (%d/%d)...\n",
address, err, attempt, maxRetries)
time.Sleep(time.Duration(5) * time.Second)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This retry logic is a good addition. To improve its readability and maintainability, I have a few suggestions:

  • Define maxRetries and the sleep duration as constants. This makes them easier to find and change.
  • The explicit type conversion time.Duration(5) is redundant since 5 is an untyped constant.
  • For consistency with the rest of the project, it would be better to use plog.Warningf for logging instead of fmt.Printf.
                const maxRetries = 12
		const retryDelay = 5 * time.Second
		for attempt := 1; attempt <= maxRetries; attempt++ {
			l, err = net.Listen("tcp", address)
			if err == nil {
				l.Close()
				break
			}

			plog.Warningf("Failed to listen on %s: %v, retrying (%d/%d)...",
				address, err, attempt, maxRetries)
			time.Sleep(retryDelay)
		}

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a retry loop to handle a race condition when binding to a network port, which is a sensible approach to the problem described. My feedback focuses on improving the code's readability and consistency with the project's existing style, specifically by using named constants for retry parameters and adhering to the established logging framework.

Comment on lines 650 to 661
maxRetries := 12
for attempt := 1; attempt <= maxRetries; attempt++ {
l, err = net.Listen("tcp", address)
if err == nil {
l.Close()
break
}

fmt.Printf("Failed to listen on %s: %v, retrying (%d/%d)...\n",
address, err, attempt, maxRetries)
time.Sleep(time.Duration(5) * time.Second)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

A few suggestions to improve this retry logic:

  1. It's good practice to define magic numbers like 12 and 5 as constants with descriptive names. This improves readability and maintainability.
  2. The logging should use the package logger plog instead of fmt.Printf for consistency. plog.Infof seems appropriate for a retry attempt.
  3. time.Duration(5) * time.Second can be simplified to 5 * time.Second.
Suggested change
maxRetries := 12
for attempt := 1; attempt <= maxRetries; attempt++ {
l, err = net.Listen("tcp", address)
if err == nil {
l.Close()
break
}
fmt.Printf("Failed to listen on %s: %v, retrying (%d/%d)...\n",
address, err, attempt, maxRetries)
time.Sleep(time.Duration(5) * time.Second)
}
const maxRetries = 12
const retryDelay = 5 * time.Second
for attempt := 1; attempt <= maxRetries; attempt++ {
l, err = net.Listen("tcp", address)
if err == nil {
l.Close()
break
}
plog.Infof("Failed to listen on %s: %v, retrying (%d/%d)...", address, err, attempt, maxRetries)
time.Sleep(retryDelay)
}

There is a race condition for `ostree.sync` and `kdump.crash.nfs`
to bind port 2049, add retry if there is, instead of retrun err
immediately.
Fixes: coreos/rhel-coreos-config#89
@HuijingHei
Copy link
Member Author

/retest

@dustymabe
Copy link
Member

see #4117

@HuijingHei
Copy link
Member Author

Close this as there is a better workaround in #4361

@HuijingHei HuijingHei closed this Nov 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[c9s] ostree.sync occasionally failed

3 participants